There is More than a Power Law in Zipf
Abstract
The largest cities, the most frequently used words, the incomes of the richest countries, and the wealthiest billionaires can all be described in terms of Zipf's Law, a rank-size rule capturing the relation between the frequency of a set of objects or events and their size. It is assumed to be one of many manifestations of an underlying power law like Pareto's or Benford's, but contrary to popular belief, from a distribution of, say, city sizes and a simple random sampling, one does not obtain Zipf's Law for the largest cities. This pathology is reflected in the fact that Zipf's Law has a functional form depending on the number of events N. Obtaining Zipf's Law requires a fundamental property of the sample distribution, which we call 'coherence', and which corresponds to a 'screening' between the various elements of the set. We show how this property should be accounted for when fitting Zipf's Law.
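As a rough illustration of the dependence on N mentioned in the abstract, the sketch below draws a simple random sample from a Pareto distribution and compares the ranked values with a standard order-statistics heuristic, x(k) ≈ x_min (N/k)^(1/α). The function names, the exponent `alpha`, and the cutoff `x_min` are assumptions chosen for the example; this is not the authors' method and it does not reproduce their 'coherence' analysis.

```python
import random


def pareto_sample(n, alpha, x_min=1.0, rng=None):
    """Draw n values with survival function P(X > x) = (x_min / x)**alpha.

    Uses inverse-transform sampling. `alpha` and `x_min` are illustrative
    parameters, not values taken from the paper.
    """
    rng = rng or random.Random()
    # 1 - rng.random() lies in (0, 1], so the power below never divides by zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def rank_size(values):
    """Sort in descending order, so index k - 1 holds the value of rank k."""
    return sorted(values, reverse=True)


def expected_rank_size(k, n, alpha, x_min=1.0):
    """Heuristic size at rank k, from solving n * (x_min / x)**alpha = k.

    The result depends explicitly on the sample size n.
    """
    return x_min * (n / k) ** (1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = 1.0
    for n in (1_000, 10_000):
        ranked = rank_size(pareto_sample(n, alpha, rng=rng))
        for k in (1, 10, 100):
            print(n, k, round(ranked[k - 1], 1),
                  round(expected_rank_size(k, n, alpha), 1))
```

In this sketch the heuristic curve shifts with the sample size at every rank, and the sampled values at the top ranks typically scatter widely around it. The heuristic describes only the average behaviour of a random sample.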
Similar resources
Strong, Weak and False Inverse Power Laws
Pareto, Zipf and numerous subsequent investigators of inverse power distributions have often represented their findings as though their data conformed to a power law form for all ranges of the variable of interest. I refer to this ideal case as a strong inverse power law (SIPL). However, many of the examples used by Pareto and Zipf, as well as others who have followed them, have been truncated ...
Generalized (m,k)-Zipf law for fractional Brownian motion-like time series with or without effect of an additional linear trend
We have translated fractional Brownian motion (FBM) signals into a text based on two "letters", as if the signal fluctuations correspond to a constant step-size random walk. We have applied the Zipf method to extract the ζ′ exponent relating the word frequency and its rank on a log-log plot. We have studied the variation of the Zipf exponent(s) giving the relationship between the frequency of occ...
Language Learning, Power Laws, and Sexual Selection
A diagnostic of a power law distribution is that a log-log plot of frequency against rank yields a (nearly) straight line. For instance, Zipf (1935) plotted word token counts in a variety of texts against the inverse rank of each distinct word type and showed that typically such plots approximate a straight line. The characteristic ‘Zipf curve’ of word frequency against rank deviates from this ...
Zipf's law against the text size: a half-rational model
In this article, we consider the Zipf-Mandelbrot law as applied to texts in natural languages. We present a simple model of the dependence of the law on the text size, characterized by a variable power-law tail and a constant ratio of the most frequent words. As a result, we derive several closed formulas, which agree with empirical data qualitatively and partially quantitatively. For example, there ap...
On the Law of Zipf-Mandelbrot for Multi-Word Phrases
The paper studies the probabilities of the occurrence of m-word phrases (m = 2, 3, ...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf is valid (i.e. a power law). We prove that in the case of m-word phrases (m ≥ 2) this is not the case. We present two independent proofs of this. We furthermore show that in case we wan...
Journal title:
Volume 2, Issue
Pages -
Publication date: 2012